System and Method for Cyber Security Threat Detection

ABSTRACT

A cyber security threat detection system for one or more endpoints within a computing environment is disclosed. The system includes one or more collector engines. Each of the collector engines includes a service and an agent operating on a corresponding system endpoint of the system endpoints. The service is configured to take a first snapshot of the corresponding system endpoint. The first snapshot includes event activity information associated with the system endpoint. The agent is configured to take a second snapshot of the corresponding system endpoint. The second snapshot includes behavioral activity information associated with the corresponding system endpoint. The system further includes an aggregator engine configured to aggregate the first snapshot and the second snapshot from each of the system endpoints into an aggregated snapshot. The system further includes one or more analytics engines configured to: generate and store baseline profiles associated with the system endpoints based on a previously received aggregated snapshot, receive the aggregated snapshot from the aggregator engine, determine deviation values for each of the system endpoints based on the received aggregated snapshot and the stored baseline profiles, and generate, for each of the system endpoints, a cumulative risk value based on the deviation values. The system further includes one or more alerting engines configured to determine whether to issue one or more alerts indicating one or more security threats have occurred for each of the endpoints in response to the cumulative risk value.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 62/466,279 filed on Mar. 2, 2017, the disclosure of which is incorporated by reference herein.

FIELD OF THE INVENTION

Embodiments of the present invention relate generally to security systems. More particularly, embodiments of the invention relate to a system and method for cyber security threat detection.

BACKGROUND

The significant growth in the frequency and severity of cyber-attacks has highlighted the failure of traditional security systems in combatting the threat of modern and advanced cyber adversaries. Organizations are increasingly recognizing the need for improved cybersecurity systems to combat cyber-attacks, and this is driving significant growth in the already large cyber security industry, which is predicted by market analysts to represent a US$170 B global market opportunity by 2020. Endpoint security is a specific problem within the cyber security industry and currently represents a US$20.9 B global market opportunity.

The Problem: Cyber Security—Growth in Frequency and Severity of Cyber Attacks

Modern attackers have adopted new tactics, techniques and procedures to circumvent the traditional security controls of organizations, leading to a significant increase in the incidence and severity of cyber-attacks. FIG. 1a shows bar graphs demonstrating the increase in the number of global security incidents 102 between 2009 and 2015. Overall, this indicates a 61% CAGR in the number of security incidents over that time. FIG. 1b shows bar graphs indicative of the average total cost 104 of a single data breach for an organization in the USA between 2013 and 2015. This rising cost corresponds to a CAGR of close to 10%, producing a $6.53 million average cost 106 in 2015 for just one data breach at a typical organization.

Although organizations have recognized the importance of preventing cyber attacks, their reliance on traditional security systems has left them vulnerable. Legacy security systems are ineffective at identifying legitimate threats and often produce large volumes of alerts, many of which are false positives (normal or expected behaviors that are identified as anomalous or malicious). As such, IT administrators within organizations do not have the necessary resources (personnel) or computational bandwidth to assess all alerts, which often leads to legitimate threats going undetected. As a result of ineffective flagging and detection systems, organizations at the present time are taking an average of 146 days to detect a data breach. Whilst an initial breach on day 1 can result in a minor security incident, the longer a breach remains undetected the higher the chance of a major data breach.

A Specific and Major Problem: Endpoint Security Risk—Breaches at the Endpoint are a Significant Challenge for Organizations

The implementation of strong endpoint security is critical as endpoints (e.g. computers and mobile devices such as smartphones and tablets) provide the gateways through which users (and potential attackers) can gain access to highly sensitive corporate or government data. Most of the biggest data breaches, judged by the number of records breached or importance of data stolen, have involved attackers leveraging stolen employee credentials to gain access to secured networks via endpoints. The significant growth in Bring Your Own Device (‘BYOD’) and Internet of Things (‘IoT’) has further compromised the endpoint security of organizations as they no longer have control over the type or number of endpoint devices available to an end user.

An organization's approach to endpoint security, and cyber security threats generally, can be broken down into two categories: a) prevention, and b) detection and response (comparable to a strong preventative gate vs. an alarm system on a house). Traditional endpoint prevention, detection and response systems rely on pre-determined threat indicators to block and detect specific threats, whereas modern cyber attacks use advanced techniques to circumvent these pre-determined criteria. Despite the growing endpoint security threat, there remains a fundamental difference between the ways in which a hacker and an employee would operate a particular endpoint.

Thus, there is a need for a behavior-based endpoint security solution that can detect anomalies in user behavior to accurately identify all threats and breaches (regardless of the cause or effect) at endpoints without the limitations of specific pre-determined criteria.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.

FIG. 1a shows bar graphs demonstrating the increase in the number of global security incidents between 2009 and 2015.

FIG. 1b shows bar graphs indicative of the average total cost of a data breach for an organization in the USA between 2013 and 2015.

FIG. 2 shows an overview diagram of a system implemented in a corporate computing environment according to one embodiment of the invention.

FIG. 3 shows an overview of a collector and an aggregator implemented on an endpoint in a corporate computing environment according to one embodiment of the invention.

FIG. 4 shows a dataflow diagram for a cloud service according to one embodiment of the invention.

FIG. 5 shows a flow diagram of a process for threat detection according to one embodiment of the invention.

FIG. 6 shows a diagram of raw activity data on an endpoint.

FIG. 7 shows a graph indicating activity for specific threat types.

FIG. 8 shows an exemplary user interface (UI) dashboard according to one embodiment of the invention.

FIG. 9 shows an exemplary diagram of software activity profiles according to one embodiment of the invention.

FIG. 10 is a block diagram of a data processing system, which may be used with one embodiment of the invention.

DETAILED DESCRIPTION

Various embodiments and aspects of the inventions will be described with reference to details discussed below, and the accompanying drawings will illustrate the various embodiments. The following description and drawings are illustrative of the invention and are not to be construed as limiting the invention. Numerous specific details are described to provide a thorough understanding of various embodiments of the present invention. However, in certain instances, well-known or conventional details are not described in order to provide a concise discussion of embodiments of the present inventions.

Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in conjunction with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification do not necessarily all refer to the same embodiment. Random access refers to access (read/write) to a random offset of a file at least once during a read/write input/output operation.

Terminology

Aggregator—A technology that acts as a centralized connection target in enterprise networks where individual endpoints are not permitted to connect to the Internet (usually) due to corporate policy. The Aggregator also simplifies integration with existing security, monitoring and alerting technologies, and reduces complexity during implementation.

API—An Application Programming Interface allows applications and services to communicate without an interactive interface (e.g., graphical user interface (GUI) or web browser).

Bot (Robot)—A Bot is a distributed technology used by attackers to automate their activities on a large scale. Often assembled into “BotNets” or “Zombie Networks”, these are large groups of infected or breached endpoints that are collectively used to carry out various attacker activities. Examples include automated attacks on non-infected endpoints and harvesting of private, sensitive or financial information (such as credit card numbers or usernames/passwords). Bots are identifiable by their behavior, which is highly automated and high speed. Keystrokes tend to be consistently timed and error free, and mouse movements tend to be perfectly straight with very little acceleration/deceleration/dwell in the movement. Bots are also heavy users of keyboard shortcuts and CMD/terminal sessions to execute strings of commands in sequence.

Collector—Endpoint technology that functions as an operating system service and user agent, designed to snapshot and collect statistical event information from the endpoint. Examples of statistical information include the number of central processing unit (CPU) processes, the size of the memory footprint, and network data transmit and receive counters. The Collector does not collect any private or sensitive data of any kind, and hashes (encrypts) collected data before delivery to the cloud analytics services to further ensure no private or sensitive data is recorded and stored.

Endpoint—Any device that is in the control of a user (employee/staff member/contractor/etc.) and is used for the performance of organization functions that require access to corporate systems. In many cases, endpoints are simply PC desktops and laptops, usually running Microsoft or Apple operating systems. Endpoints can also include mobile devices (smart phones and tablets), and Internet of Things (IoT) devices such as cameras and medical equipment.

Enterprise—An “enterprise” can be the entire organization/company, or it can also mean a division/department/business unit inside the organization/company. Therefore, a single enterprise is the collection of users and endpoints that operate as a cohesive and integrated unit, either as the superset that is the entire organization, or a subset. Note also that the emphasis can be placed on “inside”: many existing technologies are based on a design assumption that breach activity only occurs inside the corporate network perimeter. In practice, breaches may happen wherever the user and endpoint are operating, including out on the Internet, outside the protection of the traditional enterprise network.

Security Incident and Event Monitoring (SIEM)—SIEM is generally in the form of a large, expensive centralized system that takes disparate activity logging sources (such as Active Directory, proxies, firewalls, etc.) and performs analytics in order to determine threats in a corporate computing environment.

Endpoint Access Behavioral Activity—The behavior of a user while accessing the system via an endpoint, comprising all user activity relating to, but not limited to, firewall, IP address, activity counter, process info, downloads, keyboard, mouse, etc.

Systems and methods are provided for collection and aggregation of raw statistical data from system access endpoints to identify typical behavior of approved users and subsequently determine behavior changes indicating endpoint compromise. A data collector engine resides on each physical endpoint and captures user endpoint access behavioral activity data such as, for example, firewall, IP address, activity counter, process info, keyboard connections and activations, mouse telemetry, and user activity telemetry. Captured data is then securely sent to a cloud-based analysis platform to determine an approved user's behavioral profile (or fingerprint) that encompasses individual metrics, activity sequences, and comparative (historical) data. The behavioral profile is then compared to future user activity to identify irregular behaviors, and IT administrators are then alerted by reporting and alerting engines to any credible potential threats. The cloud-based analysis platform includes behavioral and metrics analytics engines that use rules and learning systems to differentiate between different users, including approved users, attackers, and malware. Also included are a User Interface dashboard for handling alerts and a Prediction Engine to assist in discovering threats and attackers through a process including establishing probabilistic trends for software activity and using these trends to determine abnormal activity.

A cyber security technology is disclosed that creates a profile or ‘fingerprint’ of an authorized user based on the way they historically use endpoint devices (for example, through the applications that are installed and commonly used by that particular user, keyboard timing (strokes or errors per minute), or mouse usage patterns) and compares that to what is actually occurring in real time to detect and flag potential breaches. The profile is constructed using numerical (non-sensitive) data from multiple user-specific behavioral metric groups. The cyber security technology can be integrated, for example, into an organization's existing security systems to enhance threat detecting and flagging capabilities whilst reducing the risk of false positives.

User-specific behavioral analysis extends beyond traditional threat detection techniques, which are easily circumvented by modern cyber-adversaries, to accurately detect abnormal behaviors that are indicative of breaches. User behaviors may be observed over time and baseline profiles are created. Baseline profiles are updated after analysis of captured data is complete, i.e. a profile is adjusted according to activities and metrics that represent changes in normal behavior over time. The adjustment process is scaled (i.e. the new behaviors must continue to exist over time in order to be built into the profile). By updating the profile only based on continued activity over time, and within other parameters (i.e. time of day, etc.), the risk that attacker behaviors become part of a normal profile may be avoided.
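
The scaled adjustment described above can be illustrated with a minimal sketch. It assumes behaviors are reduced to simple labels and that "continue to exist over time" means being observed in a fixed number of consecutive analysis windows; the class name BaselineProfile, the field names, and the required_windows value are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineProfile:
    """Hypothetical per-endpoint profile; all names here are illustrative."""
    accepted: set = field(default_factory=set)    # behaviors treated as normal
    pending: dict = field(default_factory=dict)   # behavior -> consecutive windows seen
    required_windows: int = 5                     # assumed persistence requirement

    def update(self, observed: set) -> None:
        # A pending behavior that is missing from this window loses its streak,
        # so short-lived (possibly malicious) activity never reaches the baseline.
        for behavior in list(self.pending):
            if behavior not in observed:
                del self.pending[behavior]
        # New behaviors accumulate a streak and are promoted only once the
        # streak reaches the required number of consecutive windows.
        for behavior in observed - self.accepted:
            self.pending[behavior] = self.pending.get(behavior, 0) + 1
            if self.pending[behavior] >= self.required_windows:
                self.accepted.add(behavior)
                del self.pending[behavior]


profile = BaselineProfile(accepted={"editor"}, required_windows=3)
profile.update({"editor", "port_scan"})   # seen once, then never again
for _ in range(3):
    profile.update({"editor", "new_ide"})  # persists across three windows
```

In this sketch the one-off "port_scan" behavior is discarded while "new_ide" is promoted into the baseline after three consecutive windows. Additional parameters mentioned in the text, such as time of day, are not modeled here.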

According to one aspect of the invention, a cyber security threat detection system for one or more endpoints within a computing environment is disclosed. The system includes one or more collector engines. Each of the collector engines includes a service and an agent operating on a corresponding system endpoint of the system endpoints. The service is configured to take a first snapshot of the corresponding system endpoint. The first snapshot includes event activity information associated with the system endpoint. The agent is configured to take a second snapshot of the corresponding system endpoint. The second snapshot includes behavioral activity information associated with the corresponding system endpoint. The system further includes an aggregator engine configured to aggregate the first snapshot and the second snapshot from each of the system endpoints into an aggregated snapshot. The system further includes one or more analytics engines configured to: generate and store baseline profiles associated with the system endpoints based on a previously received aggregated snapshot, receive the aggregated snapshot from the aggregator engine, determine deviation values for each of the system endpoints based on the received aggregated snapshot and the stored baseline profiles, and generate, for each of the system endpoints, a cumulative risk value based on the deviation values. The system further includes one or more alerting engines configured to determine whether to issue one or more alerts indicating one or more security threats have occurred for each of the endpoints in response to the cumulative risk value.

Behavioral Profile

One exemplary embodiment of the invention captures numerical data on at least 7 exemplary user-specific behavioral metric groups to identify a user's behavioral profile, that is, the unique way that a user interacts with their device. An example of the behavioral data collected from the keyboard is keystrokes per minute (see Table 1). This methodology can be extended to collect data on up to 75 or more metric groups. Additionally, proprietary algorithms may be used to overlay actual user endpoint activity with the expected behavioral profile to detect threats and breaches. Such methodology may be integrated into a customer organization's existing security systems to enhance breach detection and flagging capabilities. As the platform requires minimal CPU usage, its implementation will not impact the end user's experience at a customer organization.

System Overview

When incorporated into a commercial or government enterprise computer system, raw statistical data from computer endpoints may be collected in order to analyze patterns and sequences that provide 3 key outcomes:

a) Identify and differentiate between users of the endpoint, without needing to identify them by name/username/account (i.e. observe that “person x” is the usual user of the endpoint without ever knowing who “person x” is);

b) Identify changes in user behavior on the endpoint that might suggest account or endpoint compromise; and

c) Reduce the average time taken to detect such compromises from the current average of 146 days to a time more functionally useful (minutes/hours).

FIG. 2 illustrates an overview diagram of a system implemented in a corporate computing environment according to one embodiment of the invention. In FIG. 2, computing environment 202 generally includes tens or hundreds of thousands of employees. A “Collector” service and agent 204 are installed on the endpoint. The Collector 204 gathers statistical information, and produces a secured data “bundle” which it sends to the Aggregator 206 (in large corporate environments) or directly to the Cloud Service API 208 (if it is a standalone endpoint 210 on the Internet).

Once the data is received by Cloud Service 212, it is unpacked, stored in storage 214 (e.g., random access memory (RAM), hard disk, or solid state drive (SSD)) and passed through the behavioral and metrics analytics engines 216. Analytics engines 216 contain the behavioral analytics rules and learning systems that can differentiate between the activity of different users, including attackers and malware attempting to emulate valid activity. Analytics engines 216 also provide the output required by the reporting and alerting engines 218 to update status and escalate observed potential threats for further investigation. Behavioral and analytics rules are different from one another. Behavioral rules look at context in the activity, while analytics rules are more statistical and arbitrary. For example, a behavioral rule can look at the probability that a sequence of events occurs using only the first 1, 2, 3, or 4 events as the starting point. An analytic rule may look at the metric directly, and ask (for example) “is the CPU load currently within acceptable tolerances given what is currently running?”. Both of these are overlaid with a learning system that takes the manual/defined rules and supervises a learning process so that such analysis can be more usefully automated and developed/evolved.

Rules

A related analytic rule might ask: is the CPU load appropriate given the number and types of processes operating, in comparison to the expected load that is “normal” (i.e. baseline) for this user? This means there are no set thresholds, and instead the rule looks to identify how extreme deviations are from “normal”. If CPU is normally 5%, then moving to 7% may be a significant deviation for one endpoint, but minimal or no deviation for a different endpoint. Generally, no arbitrary thresholds are ever set. Each endpoint has baselines, and measurements are taken based on deviations. Since the system can operate with no initial values, there can be a period of delayed detection (not “protection”) while the service sets the baseline, which, depending on how active the endpoint is, may require a few days of reasonable activity. Overall, the methods are implemented to measure deviations at a metric, context, sequence, and profile level (i.e. deviations on the endpoint itself), and then across endpoints in a single company/enterprise, and then across all endpoints being monitored.
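
As a rough illustration of why the same 5% to 7% change can matter on one endpoint and not on another, the following sketch scores a reading against that endpoint's own history. Using the population standard deviation as the yardstick is an assumption made here for illustration; the disclosure does not specify a formula, and the function name deviation_score and the sample histories are hypothetical.

```python
from statistics import mean, pstdev


def deviation_score(history, current, floor=1e-6):
    """Distance of `current` from this endpoint's own baseline, in units of
    the endpoint's historical spread (an assumed, illustrative measure)."""
    baseline = mean(history)
    spread = max(pstdev(history), floor)  # floor avoids division by zero
    return abs(current - baseline) / spread


# Two endpoints whose CPU load averages 5% but with very different variability.
steady_endpoint = [5.0, 5.1, 4.9, 5.0, 5.0]
bursty_endpoint = [2.0, 9.0, 4.0, 8.0, 2.0]

steady_score = deviation_score(steady_endpoint, 7.0)
bursty_score = deviation_score(bursty_endpoint, 7.0)
```

For the steady endpoint a reading of 7% is dozens of spreads away from normal, while for the bursty endpoint it is well within one spread, so no fixed global threshold is needed to tell the two cases apart.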

In general, a behavioral rule can be any sequence or grouping of analyzed contexts that demonstrate sufficient/significant deviation from expected “normal” baselines. Therefore, a sequence or group of events might contain, for example, 15 metrics or “triggers”. Any sub-sequence within the group can be sufficient to increase the probability of abnormal behavior. For example, of 15 metrics in a sequence/group, on one endpoint there might be deviations on metric contexts 8-12, while on another endpoint there might be deviations detected on metric contexts 3-7. The same behavioral rule is applied, but the probabilities and therefore the prediction differ per endpoint in the application of the rule. Note that these operations have different parameters from endpoint to endpoint.
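
The sub-sequence idea can be sketched as follows, assuming each of the 15 metric contexts has already been reduced to a boolean "deviates from baseline" flag. The function names and the min_run parameter are hypothetical; the disclosure describes probabilities rather than a fixed run length, so this is a simplification.

```python
def longest_deviating_run(deviating):
    """Return (first_context, length) of the longest run of deviating contexts.

    `deviating` is a list of booleans, one per metric context in rule order.
    Context numbers in the result are 1-based; (None, 0) means no deviation.
    """
    best_start, best_len = None, 0
    run_start, run_len = 0, 0
    for index, flag in enumerate(deviating):
        if flag:
            if run_len == 0:
                run_start = index
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start + 1, run_len
        else:
            run_len = 0
    return best_start, best_len


def rule_fires(deviating, min_run=5):
    # min_run is an assumed sensitivity setting, not a value from the text.
    return longest_deviating_run(deviating)[1] >= min_run


# One rule of 15 contexts applied to two endpoints with different deviations.
endpoint_a = [8 <= n <= 12 for n in range(1, 16)]  # deviations on contexts 8-12
endpoint_b = [3 <= n <= 7 for n in range(1, 16)]   # deviations on contexts 3-7
```

Both endpoints satisfy the same rule even though the deviating contexts differ, which mirrors the point that one behavioral rule yields endpoint-specific results.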

Learning Systems

While the learning systems can operate using conventional Supervised Machine Learning (SML) or Machine Learning (ML) techniques, in one embodiment, the learning systems do not use the existence of metrics or their values to identify breaches (since that is how a signature-based system operates). Instead, the learning systems identify the “switch” between metrics, as they change context, and compare against a baseline of predicted metric behavior.

Assume that there is an analytic sequence (or series of contexts) that could be analyzed effectively by the ML. For example, objects that include one or more metrics may be provided to the ML as inputs of an analytic sequence. In one embodiment, one or more outputs of the analytic sequence may form objects that are provided to the ML as inputs of another analytic sequence.

Every metric can be an initial/starting point for analysis. Each of those metrics can have specific parameters, generally relating to an acceptable range (or scale). For example, “CPU load” can only have values from 0-100, while “Memory Footprint” may have values from 0-<unknown>, with a “reasonable” or expected range of 30% to 100%. Similarly, the number of processes running can be 0-<unknown>, and there are no specific parameters for what the values (i.e. names) of those processes are.
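
One way to represent such per-metric parameters is sketched below. The MetricSpec class and its field names are hypothetical; an upper bound of None is used here to model the open-ended "0-<unknown>" case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical description of a metric's acceptable range (or scale)."""
    name: str
    lower: float = 0.0
    upper: Optional[float] = None  # None models an unknown upper bound

    def accepts(self, value: float) -> bool:
        # A value is acceptable if it is not below the lower bound and,
        # when an upper bound is known, not above it.
        if value < self.lower:
            return False
        return self.upper is None or value <= self.upper


cpu_load = MetricSpec("cpu_load", lower=0, upper=100)
process_count = MetricSpec("process_count")  # 0 to unknown
```

A reading outside a metric's range (for example, a CPU load of 140) can then be rejected as malformed before any behavioral analysis is attempted.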

These parameters may be important because (for supervised learning to be applicable) there needs to be sufficient training data to represent the possible combinations of all context switches, and not simply the existence of the metrics themselves. While more complex to model, this approach can produce desirable results. It may also be possible to use some variations of reinforcement learning, dimensionality reduction and anomaly detection, and potentially ignore supervised and unsupervised ML completely for the core analytics.

Note that if a malware agent is already installed on an endpoint when software in accordance with aspects of the invention is first installed, it may initially create a false baseline. Activity related to existing malware may initially appear as “normal” if it is continually active. In such cases, it can take longer to profile and the end result is likely that it will appear as though multiple users are on the endpoint. Some types of malware are naturally very stealthy (particularly those that wait for instructions from “command and control” systems). These will still be detected when they suddenly wake up and become active.

Collector/Aggregator

FIG. 3 shows an overview of a collector and an aggregator implemented on an endpoint in a corporate computing environment according to one embodiment of the invention. As shown in FIG. 3, the Collector 204 may be a combination of Collector Service 302 (e.g., operating system service/daemon) and Collector Agent 304. The Collector Service 302 receives activity events directly from the operating system, and also takes regular snapshots of hardware-related statistics (e.g., resource information), such as CPU, memory, GPU, disk, Network, universal serial bus (USB), Bluetooth, Display status, and other hardware connections. The Collector Agent 304 allows snapshots to be taken of other activity that occurs in “user space” such as mouse and keyboard activity. The snapshot timing varies depending on the type of data being collected. Activity for highly volatile systems (such as a CPU) might be captured every few seconds, while less active subsystems such as Bluetooth might only have data collected every 30 seconds. Event data sent directly from the operating system is not linked to a timer, and is received by the Collector Agent when the events occur.
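
The varying snapshot timing can be pictured with a small scheduling sketch. The 5-second CPU interval is an assumed stand-in for "every few seconds", the function name snapshot_schedule is hypothetical, and time is simulated rather than measured so that the example stays deterministic.

```python
import heapq


def snapshot_schedule(intervals, horizon):
    """Return (time, subsystem) pairs for every snapshot due within `horizon` seconds.

    `intervals` maps a subsystem name to its snapshot period in seconds.
    """
    queue = [(period, name) for name, period in intervals.items()]
    heapq.heapify(queue)
    due = []
    while queue and queue[0][0] <= horizon:
        when, name = heapq.heappop(queue)
        due.append((when, name))
        # Re-arm the timer for the same subsystem.
        heapq.heappush(queue, (when + intervals[name], name))
    return due


schedule = snapshot_schedule({"cpu": 5, "bluetooth": 30}, horizon=30)
cpu_snapshots = [t for t, name in schedule if name == "cpu"]
bluetooth_snapshots = [t for t, name in schedule if name == "bluetooth"]
```

Over a 30-second window this yields six CPU snapshots and a single Bluetooth snapshot. Event data pushed by the operating system would bypass such a timer entirely, as the text notes.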

One goal is for these services and agents to be very lightweight; one exemplary implementation requires only 20 MB of memory to run and uses less than 2% of CPU runtime during the milliseconds required each time to gather the required data.

The statistics collected by the Service 302 and Agent 304 may be represented by the metrics (as previously described). The metrics are stored temporarily in a small local data store 306 until a bundle can be built. Each data bundle is sent either to the internal Aggregator or directly to the Cloud Service. The data bundles are highly compressed 308, as the data is statistical only, and then encrypted. The data is then transmitted 310 to the Analytics engines on the Cloud Service 212 to identify anomalous behaviors suggesting a breach.
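
A minimal sketch of bundle construction follows. It covers only serialization, compression, and an integrity digest using the Python standard library; the encryption step is deliberately omitted because the disclosure does not name a cipher, and the digest field, the function names, and the sample metric names are all assumptions made for illustration.

```python
import hashlib
import json
import zlib


def build_bundle(metrics):
    """Serialize and compress a list of metric snapshots (encryption omitted)."""
    raw = json.dumps(metrics, sort_keys=True).encode("utf-8")
    payload = zlib.compress(raw, 9)  # 9 = maximum compression level
    return {
        "payload": payload,
        "sha256": hashlib.sha256(payload).hexdigest(),  # assumed integrity check
        "raw_size": len(raw),
    }


def unpack_bundle(bundle):
    """Verify the digest, then decompress and parse the payload."""
    if hashlib.sha256(bundle["payload"]).hexdigest() != bundle["sha256"]:
        raise ValueError("bundle failed integrity verification")
    return json.loads(zlib.decompress(bundle["payload"]))


# Statistical snapshots are repetitive, which is why they compress well.
snapshots = [{"cpu": 5, "mem_mb": 512, "procs": 80} for _ in range(200)]
bundle = build_bundle(snapshots)
restored = unpack_bundle(bundle)
```

In a real deployment the compressed payload would be encrypted before transmission, as the text states; that step would sit between compression and the digest in this sketch.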

The Aggregator 206 is simply a data handling platform required by many large corporate environments to reduce complexity and assist with enforcing corporate policies that prevent endpoints from accessing the Internet directly. In these cases, the company can permit the aggregator to access the Internet and act as the intermediary between the endpoint Collectors 204 and the Cloud Service API 208.

The Aggregator also performs data management functions relating to alerting and reporting, specifically to reduce the need for internal systems to access the Internet directly. Using the Aggregator allows network and security operations personnel to access reporting information without needing to significantly modify their existing systems.

Cloud Service Data Flow

FIG. 4 shows a dataflow diagram for a cloud service according to one embodiment of the invention. Referring to FIG. 4, the cloud service provides the ability to deeply analyze disparate data sets, identify patterns and behaviors not readily visible inside a single enterprise, and remove the processing load from the endpoint to avoid impacting user experience. The enterprise can be the entire organization/company, or it can also refer to a division, department, or business unit inside the organization/company. Therefore, a single enterprise is the collection of users and endpoints that operate as a cohesive and integrated unit, either as the superset that is the entire organization, or a subset. Note also that the emphasis can be placed on “inside”: many existing technologies are based on a design assumption that breach activity only occurs inside the corporate network perimeter. It is also taken into account that breaches happen wherever the user and endpoint are operating, including out on the Internet, outside the protection of the traditional enterprise network. Also, by residing in the Cloud and not within the enterprise system itself, the cloud service and analytics engines 216 are more immune to malicious attack.

Data bundles from the Collectors are received by the Cloud Service API 402, unpacked and verified 404 (and quarantined/alerted if issues are identified) before the data is stored in analytics storage 406 (e.g., RAM, hard disk, SSD) and forwarded into the analytics engines 216 via the Profile Management Handler (PMH) 408. The PMH simply matches historical data with the current data to enhance the analytics effectiveness (i.e. individual data bundles are not useful without historical context). An easy test for each data bundle received is a check to see if it is within tolerances for the historical profile (i.e. is it significantly different from expected). The data bundles are very small, so continually matching received data against stored profile data enhances analytics because it reduces processing load and time once actual analytics starts. It also assists with verifying consistency (probability) and serves as an early warning of possible integrity or error issues.

Data is then delivered to the analytics engines, which have 3 core functions that operate individually and jointly:

Individual Metrics 410 are sanity checked. It may be possible to identify unauthorized activity or behavior indicating compromise where the activity is not particularly obfuscated or subtle. For example, known malware executables can be identified at this stage. Refer to Table 1 for metrics examples.

Historical and Cross Endpoint (EP) Comparatives 412 use historical data combined with analytics from all endpoints to identify patterns that may exist across the enterprise populations.

Activity Sequences 414 specifically look for valid and authorized behaviors that individually are not an indicator of compromise. However, when contextually joined together into sequence combinations, and analyzed in combination with other metrics (time/timing, zone, and the ABSENCE of specific metrics that would suggest the right user is not present), it is possible to clearly separate and prioritize behaviors that warrant further logging and monitoring, and to flag them for potential escalation. Detecting the absence of a metric is important because current intrusion detection technologies find it difficult to assess risk through missing data. In one embodiment, a Prediction Engine may be included to predict, with a determined probability, valid software activity through usual prior activity. The most common absences are technologies implemented by the operating system or other third-party vendors that are present all the time in normal operation and suddenly disappear. For example, if Windows Defender, or a third-party anti-virus technology, is notably absent (disabled/not operating) while other contextual metrics are flagged as deviating from normal, the absence of a commonly/usually present metric increases the risk associated with the contextual sequence and abnormal behavior probability prediction.
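
The handling of absent metrics can be sketched as follows. The sketch assumes each historical snapshot is reduced to the set of metric names observed in it; the min_presence cut-off, the weight multiplier, and the function names are hypothetical choices, not values from the disclosure.

```python
from collections import Counter


def absent_expected_metrics(history, current, min_presence=0.95):
    """Metrics seen in at least `min_presence` of past snapshots but missing now.

    `history` is a list of sets of metric names; `current` is a set.
    """
    counts = Counter(name for snapshot in history for name in snapshot)
    total = len(history)
    return sorted(
        name for name, seen in counts.items()
        if seen / total >= min_presence and name not in current
    )


def adjust_risk(base_risk, absent, other_deviations, weight=0.5):
    """Raise risk for each absent metric, but only alongside other deviations."""
    if not other_deviations:
        return base_risk
    return base_risk * (1 + weight * len(absent))


history = [{"defender", "browser"}] * 19 + [{"defender"}]
current = {"browser"}  # the usually present "defender" metric has disappeared
absent = absent_expected_metrics(history, current)
risk = adjust_risk(2.0, absent, other_deviations=True)
```

Here the missing "defender" metric raises the risk from 2.0 to 3.0 because other contextual deviations are also present; on its own, the absence leaves the risk unchanged in this sketch.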

Once the results are produced by the analytics engines, they are compared against a variety of tests 416 that include:

a) Repetition tests—determine whether these behaviors have been seen before.

b) Time and timing tests—determine what time of the day an activity is occurring, and the timing of the activity (i.e. how fast/slow it is occurring, and over what period of time).

c) Zone tests—determine where the behavior is occurring, from network zone through to geographic zone.
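
The three kinds of test listed above might be combined as in the sketch below. Every name here (Observation, run_tests, the field names) is hypothetical, and the reference values passed in are assumed to be derived from the endpoint's own baseline rather than fixed globally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical summary of one analyzed behavior."""
    behavior: str
    hour: int                 # hour of day, 0-23
    events_per_minute: float  # how fast the activity is occurring
    zone: str                 # network or geographic zone label


def run_tests(obs, seen_behaviors, usual_hours, max_rate, known_zones):
    """Return a pass/fail flag per test; False marks something unexpected."""
    return {
        "repetition": obs.behavior in seen_behaviors,
        "time": obs.hour in usual_hours,
        "timing": obs.events_per_minute <= max_rate,
        "zone": obs.zone in known_zones,
    }


obs = Observation("bulk_file_copy", hour=3, events_per_minute=900.0, zone="external")
results = run_tests(
    obs,
    seen_behaviors={"bulk_file_copy", "email"},
    usual_hours=set(range(8, 19)),
    max_rate=120.0,
    known_zones={"office_lan", "vpn"},
)
```

In this example the behavior itself has been seen before, but it fails the time, timing, and zone tests, which is the kind of combination that would feed into alerting.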

Alerting and Escalated Alerting

The Alerting Engine 420 is responsible for injecting alerts directly into the existing enterprise alert management or ticketing systems. Alerts are not delivered to the individual endpoint/user (who could be an attacker), although alerts are made available on the Reporting Dashboard and can be easily delivered to security administrators using a method and format the customer prefers (for example: email or text). If an endpoint is potentially being used by a real person who is not authorized, or by a real person who is authorized but behaving abnormally, it is important that the particular endpoint not receive an alert, and that the enterprise administrator responsible for security is alerted as soon as possible.

The alerting and reporting engine designs are also leveraged to implement escalated alerting. Other technologies usually only use generic thresholds or tolerances to generate alerts, much like legacy signature detection systems (i.e. “all or nothing”). The effectiveness of Security Incident and Event Monitoring (SIEM) systems can be impacted by poor (or no) tuning. The failure lies in not building these systems to adapt dynamically and automatically, using the data provided, so that they capture events and produce alerts that are useful to network and security administrators.

Escalated alerting design is based on the concept of “additive indicators of breach”. Tied directly into the design of the Activity Sequences, it is possible to identify varying trigger types and associated level of significance (risk) in order to determine the level of alert that should be generated. Consider the following Behavioral Sequence example from Behavior Sequence Example 2, where each risk trigger adds an additional risk level (level of significance) to a cumulative risk level for the activity sequence/grouping:

A user wakes up their computer (endpoint) from sleep/screensaver using a different sequence of key press or mouse movement than expected (trigger B1). They log in to the endpoint using valid credentials but the timing of keypresses along with mouse movement, while valid, is different to expected (trigger B2). The user then plugs in a USB device, that is valid but infrequent (trigger B3), and then starts typing at a speed impossible for a human (trigger B4) with perfect accuracy (trigger B5). Their typing includes opening of infrequent or never before used applications (trigger B6) that were called by keyboard shortcuts rather than mouse selections when the user historically uses only the mouse to open apps (trigger B7). Those opened applications commence attempting to connect to external (web) systems (trigger B8) that are not in the same GeoIP range as the endpoint (trigger B9) as well as internal hosts by IP address sequence (like a port scan, trigger B10), resulting in a change in port maps, network traffic ratio, disk and CPU utilization (triggers B11, B12, B13 and B14).

The above example would look valid to the vast majority of endpoint breach detection technologies, particularly since the above could occur without malware involved.

Centralized SIEM systems may detect parts of the activity (such as GeoIP connections or scanning of the internal network) from event logs produced by the operating system or monitoring of centralized network switch equipment. However, based on an incomplete picture, these systems would either alert with insufficient information (creating noise or false positives), or not alert at all. Contextual activity may be tracked over a period of time, with a risk value for each incremental risk level traversed being added to the previously determined cumulative total risk level. Some incremental risk levels would also be weighted more heavily than others (for example, trigger B4 in this example is more significant than trigger B7).

While it is possible to set weightings as static values, in one embodiment, weights may be dynamically determined based on at least two main criteria: frequency in prior history (i.e., has it ever been seen before on this endpoint, and how many times); and existence/frequency within the same organization (i.e., are other endpoints in the same network experiencing the same/similar activity). It is possible to weight some categories higher than others. For example, a metric indicating that anti-virus services have been disabled is more interesting (and a higher risk) than observing a change in network traffic volume.
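As a hedged illustration of dynamically determined weights, the sketch below combines the two criteria named above with a per-category multiplier. The formula itself is an assumption made for illustration: the disclosure does not specify one. In particular, this sketch assumes that activity seen on many other endpoints in the organization is treated as less anomalous, although an implementation could reasonably take the opposite view.

```python
def dynamic_weight(times_seen_on_endpoint, endpoints_reporting,
                   total_endpoints, category_weight=1.0):
    """Hypothetical weighting: rarer activity earns a larger weight.

    times_seen_on_endpoint: prior occurrences on this endpoint.
    endpoints_reporting / total_endpoints: prevalence across the organization.
    category_weight: static multiplier for the metric category.
    """
    endpoint_rarity = 1.0 / (1 + times_seen_on_endpoint)
    org_rarity = 1.0 - endpoints_reporting / total_endpoints
    return category_weight * (endpoint_rarity + org_rarity) / 2

# Never seen anywhere, in a heavily weighted category (illustrative values).
w_new = dynamic_weight(0, 0, 100, category_weight=3.0)
# Seen nine times on this endpoint and on half of the organization.
w_common = dynamic_weight(9, 50, 100)
```

With these illustrative inputs, the never-before-seen activity receives a weight ten times larger than the familiar one.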

Since administrators historically are not especially skilled at quantifying risk, incremental risk levels and weighting are preferred to be learned over time, based on contextual sequences and relative to a “normal” baseline. In the preferred embodiment, there are no pre-set thresholds with respect to each category of trigger event. For each category of trigger event (each context), a baseline is continually established and dynamically updated over time. Excursions are observed with respect to the baseline, and it is abnormal excursions that cause a trigger. One embodiment looks for excursions of, for example, 1 or 2 standard deviations in either direction. However, for a preferred embodiment, variances are compared across a sequence or group of contexts, and are not relied on individually. Therefore, a single context “step” that moves by more than 1-2 standard deviations may not be enough to trigger unless there are also notable variances in other “steps” in the sequence. In this way, the variances are not aggregated like traditional systems, but are more accurately defined as “dependencies” where multiple variances will be required. In a further embodiment, variances/deviations must also occur in a particular order of context switches to trigger an alert. This will also have the effect of further reducing false positives.
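The dependency idea above, where one excursion alone does not trigger but several excursions within the same sequence do, can be sketched as follows. The metric names, the sample histories, the 2-standard-deviation bound and the requirement of at least two excursions are all illustrative assumptions. The ordering requirement of the further embodiment is not modeled here.

```python
from statistics import mean, stdev

def excursion(history, value, k=2.0):
    """True if value lies more than k standard deviations from the baseline mean."""
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(value - mu) > k * sigma

def sequence_triggers(baselines, observed, min_excursions=2, k=2.0):
    """Trigger only when several steps of the sequence deviate together."""
    hits = [name for name, value in observed
            if excursion(baselines[name], value, k)]
    return len(hits) >= min_excursions, hits

# Hypothetical per-context baselines learned for one endpoint.
baselines = {
    "keys_per_min": [50, 52, 48, 51, 49],
    "mouse_dwell_ms": [200, 210, 190, 205, 195],
    "usb_events": [0, 0, 1, 0, 0],
}
single = sequence_triggers(
    baselines, [("keys_per_min", 400), ("mouse_dwell_ms", 202), ("usb_events", 0)])
joint = sequence_triggers(
    baselines, [("keys_per_min", 400), ("mouse_dwell_ms", 20), ("usb_events", 0)])
```

Here `single` does not trigger because only the typing rate deviates, while `joint` triggers because the typing rate and the mouse dwell deviate in the same sequence.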

For one embodiment, a function of the Alerting Engine may include receiving authorized user confirmation that the detected activity is actually valid. Examples include cases where an authorized user has a support resource operating their endpoint for support purposes, or where the authorized user has an injury that might change their keyboard or mouse styles. In such cases, the activity will still continue to be tracked and the endpoint would appear on the enterprise security dashboard, but no alert would be sent if the authorized user successfully answers a challenge sent “out of band” (OoB). Such challenges can be by SMS (text, etc.), or using a phone-based authentication system (like Google Authenticator or similar). If the user successfully enters the OoB challenge, then monitoring would continue but alerting would not occur. If the challenge is failed, or not entered at all, an alert would be generated immediately above all other metrics. For a preferred embodiment of this function, alerts are not sent to the user, but are delivered via the reporting and alerting dashboard where security administrators are involved, in order to avoid alerting an attacker or allowing an attacker to continue destructive activity.

Reporting

Reporting is critical because early alerting of an organization to a new threat may prevent major damage. Some threats nibble away at an organization over long periods of time while others, if allowed, capture or corrupt sensitive information or inflict financial damage in very short time periods.

The reporting engine 422 is designed to produce output in predefined formats that can be delivered to existing enterprise monitoring and reporting systems, in addition to providing a direct cloud-based reporting dashboard for clients without existing systems.

Dashboard reporting can be basic or advanced. Basic reporting allows for investigation of events on specific Collectors while also highlighting general statistics about metrics and alerts. Advanced reporting allows companies to investigate events and alerts all the way to source data, as well as review historical analysis.

Examples of Behavioral Activity Groupings

Behavioral analytics and detection technology identifies groups or sequences of valid events or activities that, in isolation, are authorized and permitted. However, when performed together in particular ways, they can represent a security incident or breach.

Embodiments according to the invention are not intended to be a replacement for legacy or existing technologies such as anti-malware or SIEM, which already serve an important function inside corporate networks. Instead, an additional technology is provided, specifically addressing a design and implementation gap: breaches from insiders (disgruntled or ex-employees), and external actors with stolen credentials, are difficult to detect when their only activities mirror or mimic the real and authorized users.

The examples outlined below are real scenarios that have either been used in successful breaches, or used by security testing consultants to gain access to networks and test defenses. These methods are generally successful without detection by most existing and legacy security technologies. This list is not complete, but it is representative of the types of attacks that have the highest rate of success. Depending on the weightings of each trigger event within a grouping, an alert may be triggered without all the events in a group occurring if the trigger events that cause the alert are weighted heavily enough. For example, if a sub-group of events, say 3 or 4 out of the 8 trigger events in a group, are weighted much more heavily than the others in the group, an alert may be generated by variability within just this sub-group of events.

As discussed above, each of the events in an access scenario by users is assigned a risk weighting by the analytics engines based on its importance. Events that are dependencies and show variances from the normal fingerprint of the user, including deletion/omission, addition and changes, are considered trigger events, and their cumulative weight is used by the analytics engine to initiate a trigger. When the total weight (cumulative total risk level) of the dependencies identified crosses an alert level risk threshold, notification is provided to the reporting/alerting engines for threat alert generation. Note that it is preferred that the alert level risk threshold is established by a learning process over time based on tracked excursions of the cumulative total risk level.
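A minimal sketch of the cumulative weighting described above, under stated assumptions: the trigger labels follow the A1 to A8 pattern of Example 1, but the weights and the threshold of 12 are invented for illustration, and "crossing" the threshold is interpreted as reaching or exceeding it. The sketch also shows how a heavily weighted sub-group can raise an alert without every trigger in the group firing.

```python
def cumulative_risk(triggered, weights):
    """Sum the weights of the trigger events that showed a variance."""
    return sum(weights[t] for t in triggered)

def should_alert(triggered, weights, alert_threshold):
    """Alert once the cumulative total risk level reaches the threshold."""
    return cumulative_risk(triggered, weights) >= alert_threshold

# Illustrative weights: three triggers dominate the group of eight.
weights = {"A1": 1, "A2": 1, "A3": 1, "A4": 5,
           "A5": 1, "A6": 5, "A7": 5, "A8": 1}
heavy_subgroup = should_alert({"A4", "A6", "A7"}, weights, alert_threshold=12)
light_majority = should_alert({"A1", "A2", "A3", "A5", "A8"}, weights, alert_threshold=12)
```

Three heavily weighted triggers reach the threshold, while five lightly weighted triggers do not. In the preferred embodiment the threshold would be learned over time rather than fixed as it is in this sketch.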

Behavioral Activity Generating Trigger Group 1 [Example 1]: Remote Desktop Connection with Stolen Credentials

The scenario: Attackers, using stolen credentials obtained from phishing attacks or through dictionary/brute-force password guessing, will scan corporate external perimeters looking for remote access gateways. Once a gateway is identified, access is generally straightforward.

Attacker establishes remote desktop connection with target gateway and is presented with login screen (trigger A1, new source address). Attacker may move mouse to verify connection or put the cursor in the username field (trigger A2). Attacker types stolen username/password differently to the real user (trigger A3) or copy/pastes credentials (trigger A4). Once logged into remote desktop, attacker will usually adopt keyboard shortcuts which are faster than mouse movements (trigger A5), or will open CMD/terminal (trigger A6) to execute scripts (trigger A7) that were created using copy/paste (trigger A8) onto a local hidden or temporary directory (trigger A9). Attacker may also download tools (trigger A10) and install tools to ensure endpoint remains accessible across reboots (trigger A11).

From here, the attacker's movements, laterally across the network, can trigger several metrics depending on their objective. Data theft can change the network and disk profiles; software installation can change the firewall, network and connection tables as well as disk, memory and CPU profiles; scanning or connections to internal systems can change the transmit/receive ratios and connection tables. Mouse and keyboard analytics would also highlight issues where the attacker is manually operating the endpoint, or the endpoint is being controlled by an automated system (bot) that does not behave naturally (large number of triggers, and trigger sequences).

Behavioral Activity Generating Trigger Group 2 [Example 2]: Insider Access with Stolen Credentials, or Endpoint Left Unlocked

The scenario: A user (disgruntled, bribed, contractor) wishes to either steal sensitive data or deliver malware that allows remote connection at a later time. The user does not want to be detected and therefore uses stolen credentials, or takes advantage of an unprotected unlocked endpoint.

A user wakes up the computer (endpoint) from sleep/screensaver using a different sequence of key presses or mouse movement than expected (trigger B1). They log in to the endpoint using valid credentials but the timing of keypresses along with mouse movement, while valid, is different to expected (trigger B2). Or, identifying an unlocked endpoint, user moves the mouse differently to the real user (trigger B3) or uses keyboard shortcuts to open known menus or applications (trigger B4). Unfamiliar with desktop layout or available applications, unauthorized users will either use the mouse to browse menu options (trigger B5); use the search function to quickly locate specific applications (trigger B6); or open a CMD/terminal to directly access the filesystem (trigger B7).

Similar to Example 1, the attacker's movements, laterally across the network, can trigger a number of metrics depending on their objective. Data theft can change the network and disk profiles; software installation can change the firewall, network and connection tables as well as disk, memory and CPU profiles; scanning or connections to internal systems can change the transmit/receive ratios and connection tables (large number of triggers, and trigger sequences).

Behavioral Activity Generating Trigger Group 3 [Example 3]: USB Device Inserted and Subsequent Commands

The scenario: A user (disgruntled, bribed, contractor) wishes to either steal sensitive data or deliver malware that allows remote connection at a later time but has limited time and therefore automates the attack using a USB device. Alternatively, user finds a USB device in the carpark and decides it can't hurt to test on the work computer.

A user wakes up the computer (endpoint) from sleep/screensaver using a different sequence of key press or mouse movement than expected (trigger C1). They log in to the endpoint using valid credentials but the timing of keypresses along with mouse movement, while valid, is different to expected (trigger C2). The user then plugs in a USB device, which is a valid but infrequent action (trigger C3), and then starts typing at a speed impossible for a human (trigger C4) with perfect accuracy (trigger C5). Their typing includes opening of infrequent or never before used applications (trigger C6) that were called by keyboard shortcuts rather than mouse selections when the user historically uses only the mouse to open apps (trigger C7). Those opened applications commence attempting to connect to external (web) systems (trigger C8) that are not in the same GeoIP range as the endpoint (trigger C9) as well as internal hosts by IP address sequence (like a port scan, trigger C10), resulting in a change in port maps, network traffic ratio, disk and CPU utilization (triggers C11, C12, C13 and C14).

Behavioral Activity Generating Trigger Group 4 [Example 4]: Attach to Privileged Process with Reverse Shell Access

The scenario: Most common with remote attacks, the attacker is highly motivated to establish persistence on the target endpoint, so that they can continue to gain access in future. Attaching to a privileged process allows the attacker to still connect to an endpoint even after it has been rebooted.

Generally, the easiest way for an attacker to gain remote access to an endpoint is through phishing or similar attacks. Other options also exist, and the end result is the same: the attacker is able to get a remote connection to the endpoint. Detected triggers are likely provided in the initial stages of the phishing attack (such as the installation of malware that results in automated installation of tools/droppers and automated (bot) connections back to the breached endpoint). This example covers the scenario after the endpoint has initially been remotely breached and the attacker connects for the first time (i.e. it is not simply a malware infected endpoint operating as a bot).

Initial connections in this scenario are unlikely to be graphical (i.e. not remote desktop). Attackers will have “shell” access, which is command-based access to enter instructions (i.e. typing only, with only text as the interface). Depending on the shell access method used (trigger D1), visibility of the commands being entered may or may not be provided (and therefore of the keyboard metrics: trigger D2 if they are visible, trigger D3 if the keystrokes are not visible but user-interactive commands are being executed). The attacker will initiate a series of activities to identify the endpoint (trigger D4), understand the filesystem layout (trigger D5), identify network connections to file servers and other potential targets (trigger D6) and identify privilege level of the user they are pretending to be (trigger D7).

The attacker can then undertake a variety of very small activities that will take advantage of a weakness or vulnerability in the endpoint (trigger D8). This is common if the endpoint is not fully patched and updated (operating system) or has unpatched applications installed. Exploiting these vulnerabilities may require small exploit tools to be uploaded to a local temporary directory (trigger D9) or custom code written on the endpoint to be executed (trigger D10) usually after downloading code or content from external sources without using a browser (trigger D11).

Execution of the attacker's code will exploit the vulnerability and allow the attacker to attach their code to a privileged process. This code most often takes the form of a “reverse shell”. These are CMD/terminal sessions where the breached endpoint makes a new outbound connection to the attacker's Bot network every time the endpoint is rebooted. Such a process changes various metrics on the endpoint, including network (connection) tables, network activity ratios, CPU and memory footprints, etc. There are numerous triggers that would combine in this scenario to identify a particularly stealthy attacker who has not done anything that would be detected by existing anti-malware or other security systems.

This example is broadly similar to the challenge given to a security tester when evaluating detection technology. Embodiments according to the invention would detect the attacker as they exploited the endpoint and attacked other endpoints, while legacy technologies would not.

Individual Metrics

Metrics may be defined by a broad category, and may be identifiable within a given platform or operating system on an endpoint, or are a combination of values taken from a variety of sources. The exemplary and non-limiting list below highlights some of the common categories and metric types, but has been generalized and is not to be taken as a complete source or reference.

TABLE 1: Examples of Metrics

Category | Metric
Firewall | Application; direction; IP address and port; sub-application (child and thread)
Process Info (CPU) | Process name; process ID (PID); operator; volatility; process creation and destruction events
GPU | Load; display state; duration
Memory | Memory map and size; volatility
IP Address (Network) | Local address and port; remote address and port; session state; related process ID (PID); GeoIP of remote addresses; resolved domains with historical tracking
Activity Counter (Network) | Packets sent; bytes sent; bytes received; bytes total
Port Info (Network) | Port count; session states; inbound/outbound ratio; GeoIP of source/destination; volatility
User | Account name; user flags; operator
Groups & Membership | Group names; member name; member group name
Applications-privileged | State; in use
Applications-user | State; in use
Keyboard | Keys per minute; backspace/delete per minute; control keys per minute; timing between keys; timing between common and defined sequences
Mouse | Movement duration; movement commencement duration; movement acceleration duration; movement deceleration duration; movement dwell before stop; movement dwell before button press; duration between button press; button presses per minute
Display | State; duration
USB | State; activity
Bluetooth | State; activity
Other hardware | State; activity

Overall Process Flow

FIG. 5 shows a flow diagram of a process for threat detection according to one embodiment of the invention. In step S502, activity information is collected on one or more endpoints. In step S504, the collected activity information is aggregated, and optionally transferred to the cloud (e.g., over a network) in order to be processed in a more secure environment. In step S506, the aggregated information is analyzed using rules and learning systems to differentiate between normal and abnormal activity. In step S508, for each specified category of threat, activity deviations are compared within the category to one or more risk thresholds, and a weighted risk value is assigned for each threshold traversed, to create a cumulative risk value for activity over all threat categories within a specified time period. In step S510, an alert level is assigned to activity in response to the cumulative risk value being compared with an alert level risk threshold or alternately in response to a number of risk thresholds traversed. Last, in step S512, one or more security administrators are alerted according to the alert level.
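Steps S508 and S510 can be sketched as follows. This is an illustrative reading of the flow, not the disclosed implementation: the category names, deviation values, per-category thresholds, weights and alert-level bounds are all hypothetical, and a threshold is treated as traversed when the deviation reaches it.

```python
def category_risk(deviation, thresholds, weight):
    """S508: add `weight` once for every risk threshold the deviation traverses."""
    return weight * sum(1 for t in thresholds if deviation >= t)

def cumulative_risk(deviations, thresholds, weights):
    """S508: total weighted risk over all threat categories in the time window."""
    return sum(category_risk(d, thresholds[c], weights[c])
               for c, d in deviations.items())

def alert_level(risk, alert_thresholds):
    """S510: highest alert level whose bound the cumulative risk meets."""
    level = "none"
    for name, bound in sorted(alert_thresholds.items(), key=lambda kv: kv[1]):
        if risk >= bound:
            level = name
    return level

# Hypothetical deviations (in standard deviations) for one endpoint.
deviations = {"keyboard": 3.2, "network": 1.1, "usb": 2.5}
thresholds = {c: [1, 2, 3] for c in deviations}
weights = {"keyboard": 3, "network": 1, "usb": 2}
risk = cumulative_risk(deviations, thresholds, weights)
level = alert_level(risk, {"low": 5, "medium": 10, "high": 20})
```

With these inputs the keyboard category traverses three thresholds, the USB category two and the network category one, which yields a cumulative risk of 14 and a "medium" alert level for step S512 to act on.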

Critical, time sensitive alerts to threats that may place the computer system in immediate jeopardy are annotated or provided in such a way as to guarantee system administrators are immediately alerted. For example, a text message may be sent to responsible system administrators from a specific phone number such that an administrator cellular phone can be programmed to respond with a specific and unique tone when an extremely time critical alert is generated.

User Interface Dashboard

The user interface dashboard is available to security administrators for computer systems where Collector 204 is installed, and provides control of security processes according to aspects of the invention as well as a variety of displays of analytical information and status. Facilities are included to notify security administrators of alerts of varying priorities, as well as to prominently display alerts for high-priority identified threats. User activity at endpoints can be viewed in the user interface in a variety of formats. FIG. 6 shows a graph of raw activity data that has been scaled and normalized where amplitudes for metrics 602 are shown over time 604. For example, only three metrics are shown for clarity. These include firewall activity 610, keyboard activity 608 and mouse activity 606. Many additional metrics may also be shown in such a graph including for example: network activity, IP address activity, disk activity, etc. See Table 1 for a more comprehensive list of metrics.

After analysis has been performed, a display such as that of FIG. 7 can be viewed. Again, metrics 702 are shown over time 704, except in the graph of FIG. 7 the same metrics shown in raw form for FIG. 6 are now shown as abnormal activity spikes that indicate a departure from a normalized profile background. Spikes 706, 708, and 710 show abnormal activity spikes for mouse, keyboard, and firewall activity respectively.

FIG. 8 shows an exemplary user interface (UI) dashboard according to one embodiment of the invention. Referring to FIG. 8, in the upper left, a responsible security administrator has signed in and is identified 802. The dashboard may have a multitude of functional modes 804 including: general management of the dashboard; setting up notification; recent activity display; and assignment of dashboard presets. Other functions 806 may be controlled from the dashboard including for example and without limitation data traffic display, analysis display, mailbox access, and display, control, and disposition of alerts. At the top of this exemplary display, data traffic 808 is shown. In the center of the display of FIG. 8, an activity graphic 812 is displayed which in this case has been chosen to display the graph containing abnormal activity spikes as previously shown in FIG. 7. At the right of the display a list of alerts 814 is shown, and under the alerts a summary of statistics 816 is shown. At the bottom of the display recent activity 818 is shown, and also a progress report 820 is shown indicating that a security check is currently 34% complete.

Prediction Engine and Probability Profiles

Endpoints (computers/mobile devices/etc.) are constrained by the technical sequence that is followed when an application program or process starts. Any software application that runs (such as MSWord.exe) must execute several predefined steps in a particular process sequence to function correctly. Along the way, a software process will typically touch certain files or involve other applications. Even malware and attackers are constrained by the technical sequence that endpoint hardware and operating systems must follow for applications to function. One embodiment of the invention, herein called the Prediction Engine, identifies when attackers and/or malware attempt to manipulate such a process. The Prediction Engine predicts the steps that an application would be expected to usually follow, with a determined probability that a process should follow the steps, in the context of the user who normally operates/owns the endpoint. When applications diverge from their predicted sequences, the probability of a possible breach increases. Thus, the Prediction Engine analyzes the effects of user behaviors at an endpoint as well as software execution sequences.

There are scenarios where attacker code or malware can “attach” themselves to an existing process, after the existing process has already passed controls that might exist, such as permission/authorization checks, or establishing a connection with the Internet. This can be viewed as malware or attacker code “hitching a ride” and taking advantage of the authorized application letting them pass through the usual control/security gates.

If a user launches MSWord.exe regularly, there is a pattern that is followed and that is recorded by the Prediction Engine. The fact that this pattern has been seen multiple times in the past indicates that running MSWord.exe is normal for that user, and that the associated probability of that pattern is normal. The steps the application follows when starting are well known and well defined. Using this data, the Prediction Engine predicts the steps MSWord.exe is likely to take each time it runs in future, with high probability, and therefore high confidence.

When an application is run, but does not follow the predicted series of steps that the established behavioral model expects, then the probability that this is the same application as seen previously is lower than expected, and therefore may warrant further analysis or an alert to be created. Divergent behavior increases the probability that the application in question has been modified or manipulated, such as when the application has not been patched and contains vulnerabilities that attackers can exploit. This probability is a metric of its own, and can be combined with other metrics to derive an overall risk weighting. Also, current technologies find it difficult to assess risk through missing data. The Prediction Engine predicts valid activity sequences through observation of usual prior activity, and also determines a potential breach by discovering an ABSENCE of activity that with high probability would normally exist.
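One way to make the predicted-sequence probability concrete is a first-order transition model learned from prior runs. This modeling choice is an assumption made for illustration: the disclosure does not state which statistical model the Prediction Engine uses, and the step names below are hypothetical.

```python
from collections import Counter, defaultdict

def learn_transitions(runs):
    """Count how often each step was followed by each next step in prior runs."""
    counts = defaultdict(Counter)
    for run in runs:
        for a, b in zip(run, run[1:]):
            counts[a][b] += 1
    return counts

def sequence_probability(counts, run):
    """Probability of a run as the product of its step-to-step probabilities."""
    p = 1.0
    for a, b in zip(run, run[1:]):
        followers = counts.get(a, Counter())
        total = sum(followers.values())
        p *= followers[b] / total if total else 0.0
    return p

# Hypothetical launch history for one application on one endpoint.
history = ([["start", "load_dll", "open_doc"]] * 9
           + [["start", "load_dll", "check_update"]])
model = learn_transitions(history)
p_usual = sequence_probability(model, ["start", "load_dll", "open_doc"])
p_divergent = sequence_probability(model, ["start", "load_dll", "spawn_shell"])
```

The usual launch sequence scores 0.9, while a run containing a step never seen after `load_dll` scores 0.0. Such a low probability could then be combined with other metrics as one input to the overall risk weighting.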

The graph of FIG. 9 shows expected activity sequences for given software processes that behave as predicted (with high probability) while a divergent process that did not follow the predicted path is highlighted. While the divergent path does not necessarily indicate a breach, it does indicate anomalous behavior that may warrant further investigation, collection of additional metrics, or generating an alert. Analysis may determine to what degree the diversion path is abnormal, and assign a weighted threat level to each successive diversion from the most probable path. In this particular example, programs typically start at a root directory 902 such as C:, and then propagate from particular primary directories such as Program Files (x86) 904, Windows 906, Program Files 908, and Users 910. From there, programs propagate in relatively predictable manners with a probability that is determined over time by the prediction engine, operating various applications 914 such as for instance Microsoft Silverlight and Windows Defender. Some processes will touch 912 different directories or engage with other applications along the way, but typically do so in a predictable manner with an associated probability.

Occasionally a software process such as 916 may divert from a predictable path at a juncture such as 918, where a software distribution function is invoked, later causing another diversion where download 920 occurs, resulting in installation 922, which in fact may represent an injection of malware code into the system. A threat level is established in a probabilistic manner according to an amount of diversion from a predictable path, as well as the category of functionality that is represented by the diversion. Subsequently alerts are generated in response to elevated threat levels.
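The idea of a threat level that grows with each successive diversion, scaled by the category of functionality involved, might be sketched as below. The scoring rule, the 0.05 cutoff, the step names, the probabilities and the category weights are all illustrative assumptions and do not come from the disclosure.

```python
def threat_level(path, step_probability, category_weight, cutoff=0.05):
    """Accumulate a weighted score for every improbable step along a path.

    step_probability: learned probability of each step (missing steps count as 0).
    category_weight: multiplier for the functionality a step represents.
    """
    level = 0.0
    for step in path:
        p = step_probability.get(step, 0.0)
        if p < cutoff:  # treated as a diversion from the predictable path
            level += category_weight.get(step, 1.0) * (1.0 - p)
    return level

# Hypothetical probabilities loosely following the FIG. 9 narrative.
probs = {"C:": 1.0, "Program Files": 0.9,
         "software_distribution": 0.02, "download": 0.01, "install": 0.0}
cat_weights = {"download": 2.0, "install": 3.0}
diverted = threat_level(
    ["C:", "Program Files", "software_distribution", "download", "install"],
    probs, cat_weights)
normal = threat_level(["C:", "Program Files"], probs, cat_weights)
```

In this sketch the diverted path accumulates a score from three improbable steps, with the download and install steps counting more heavily, while the normal path scores zero.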

Automated tasks and systems which typically operate within an enterprise system may be included with additional processing typically performed in the Cloud to determine as quickly as possible when an attack or unauthorized intrusion has been made. Because the primary enterprise/computer system being protected is generally considered more vulnerable to threats than functionality operating in the Cloud, processes according to aspects of the invention may move activity data to the servers in the Cloud as quickly as possible through the Internet in order to take advantage of this additional safety factor. A critical threat produces a high-priority alert so that responsible system administrators are immediately notified and can act quickly to mitigate potential damage from the identified threat. In one embodiment, security administrators are notified by a text to their cell phone through a cellular infrastructure in order to alert them as quickly as possible and further to provide a unique audible tone that is specifically associated with highly critical alerts.

FIG. 10 is a block diagram of a data processing system, which may be used with one embodiment of the invention. For example, the system 3000 may be used as part of the cloud service 212 as shown in FIG. 2. Note that while FIG. 10 illustrates various components of a computer system, it is not intended to represent any particular architecture or manner of interconnecting the components, as such details are not germane to the invention. It will also be appreciated that network computers, handheld computers, mobile devices (e.g., smartphones, tablets) and other data processing systems which have fewer components or perhaps more components may also be used with the invention. The system 3000 of FIG. 10 may, for example, be a client or a server.

As shown in FIG. 10, the system 3000, which is a form of a data processing system, includes a bus or interconnect 3002 which is coupled to one or more microprocessors 3003 and a ROM 3007, a volatile RAM 3005, and a non-volatile memory 3006. The microprocessor 3003 is coupled to cache memory 3004. The bus 3002 interconnects these various components together and also interconnects these components 3003, 3007, 3005, and 3006 to a display controller and display device 3008, as well as to input/output (I/O) devices 3010, which may be mice, keyboards, modems, network interfaces, printers, and other devices which are well-known in the art.

Typically, the input/output devices 3010 are coupled to the system through input/output controllers 3009. The volatile RAM 3005 is typically implemented as dynamic RAM (DRAM) which requires power continuously in order to refresh or maintain the data in the memory. The non-volatile memory 3006 is typically a magnetic hard drive, a magnetic optical drive, an optical drive, or a DVD RAM or other type of memory system which maintains data even after power is removed from the system. Typically, the non-volatile memory will also be a random access memory, although this is not required.

While FIG. 10 shows that the non-volatile memory is a local device coupled directly to the rest of the components in the data processing system, a non-volatile memory that is remote from the system may be utilized, such as a network storage device which is coupled to the data processing system through a network interface such as a modem or Ethernet interface. The bus 3002 may include one or more buses connected to each other through various bridges, controllers, and/or adapters, as is well-known in the art. In one embodiment, the I/O controller 3009 includes a Universal Serial Bus (USB) adapter for controlling USB peripherals. Alternatively, I/O controller 3009 may include an IEEE-1394 adapter, also known as FireWire adapter, for controlling FireWire devices.

In the foregoing specification, embodiments of the invention have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

1.-22. (canceled)
 23. A cyber security threat detection system operating within a computing environment, the system comprising: one or more collector engines operating at least in part within a computing environment, and configured to acquire behavioral activity information over a period of time; a prediction engine operating on the acquired behavioral activity information, and configured to predict expected behavioral activity based on historic behavioral activity from the recorded behavioral activity information, to compare new behavioral activity with the expected behavioral activity, and to determine a probability of occurrence of the new behavioral activity based on the comparison; an analytics engine configured to generate a security risk level based on the probability of occurrence of the new behavioral activity; and an alerting engine configured to issue one or more alerts in response to a determination that the security risk level has exceeded a risk threshold.
 24. The cyber security threat detection system of claim 23, wherein the probability of occurrence of the new behavioral activity increases if the new behavioral activity substantially behaves in accordance with the expected behavioral activity.
 25. The cyber security threat detection system of claim 23, wherein the probability of occurrence of the new behavioral activity decreases if the new behavioral activity diverges from the expected behavioral activity, thereby indicating a possible security breach.
 26. The cyber security threat detection system of claim 23, wherein a lower probability of occurrence of the new behavioral activity indicates a greater security risk level, and vice versa.
 27. The cyber security threat detection system of claim 23, wherein each of the one or more collector engines is installed on an endpoint operating within the computing environment.
 28. The cyber security threat detection system of claim 23, wherein the probability of occurrence of the new behavioral activity is combined with additional metrics to derive an overall security risk level.
 29. The cyber security threat detection system of claim 23, wherein the computing environment includes one or more operations in a cloud service.
 30. The cyber security threat detection system of claim 24, wherein a probability of breach decreases if the new behavioral activity substantially behaves in accordance with the expected behavioral activity.
 31. The cyber security threat detection system of claim 30, wherein the probability of breach increases if the new behavioral activity diverges from the expected behavioral activity.
 32. A cyber security threat detection system operating within a computing environment, the system comprising: one or more collector engines operating at least in part within a computing environment, and configured to acquire behavioral activity information over a period of time; a prediction engine operating on the acquired behavioral activity information, and configured to predict expected behavioral activity based on historic behavioral activity from the recorded behavioral activity information, to compare new behavioral activity with the expected behavioral activity, and to determine whether an activity with a high probability of occurrence from the new behavioral activity is absent based on the comparison; an analytics engine configured to generate a security risk level based on the determination whether the expected activity is absent; and an alerting engine configured to issue one or more alerts in response to a determination that the security risk level has exceeded a risk threshold.
 33. The cyber security threat detection system of claim 32, wherein the security risk level increases in response to a determination that the activity with the high probability of occurrence from the new behavioral activity is absent.
 34. The cyber security threat detection system of claim 32, wherein each of the one or more collector engines is installed on an endpoint operating within the computing environment.
 35. The cyber security threat detection system of claim 33, wherein the absent activity includes a service normally present within the computing environment, but has suddenly disappeared, has become disabled, or is not operating.
 36. The cyber security threat detection system of claim 33, wherein the absent activity includes an absence of a metric.
 37. The cyber security threat detection system of claim 36, wherein the absence of a metric increases a probability of abnormal behavior and a weighted risk level associated with the metric.
 38. The cyber security threat detection system of claim 32, wherein the computing environment includes one or more operations in a cloud service.
 39. A computer-implemented method for cyber security threat detection, the method implemented by one or more processors operating within a computing environment, the method comprising: receiving behavioral activity information that has been acquired over a period of time; operating on the received behavioral activity information to predict expected behavioral activity based on historic behavioral activity from the received behavioral activity information; and determining a probability of occurrence of new behavioral activity based on a comparison of the new behavioral activity with the expected behavioral activity.
 40. The method of claim 39, wherein the comparison determines activity deviations between the new behavioral activity and the expected behavioral activity.
 41. The method of claim 39, further comprising: generating a security risk level based on the probability of occurrence of the new behavioral activity.
 42. The method of claim 40, wherein determining the probability of occurrence of new behavioral activity comprises increasing the probability of occurrence if the new behavioral activity substantially behaves in accordance with the expected behavioral activity.
 43. The method of claim 40, wherein determining the probability of occurrence of new behavioral activity comprises decreasing the probability of occurrence if the new behavioral activity diverges from the expected behavioral activity.
 44. The method of claim 41, wherein a lower probability of occurrence of the new behavioral activity indicates a greater security risk level, and vice versa.
 45. The method of claim 41, wherein operating on the received behavioral activity information to predict expected behavioral activity comprises predicting an operation pattern that an application is expected to follow using the historic behavioral activity, and the probability of occurrence of new behavioral activity is a determined probability that the application follows the predicted operation pattern.
 46. The method of claim 45, wherein the security risk level is generated based on an amount of diversion from the predicted operation pattern.
 47. The method of claim 46, further comprising assigning a weighted risk value to each successive diversion from the predicted operation pattern.
 48. The method of claim 39, wherein the received behavioral activity information is collected from one or more endpoints operating within the computing environment.
 49. The method of claim 39, wherein the computing environment includes one or more operations in a cloud service.
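
By way of non-limiting illustration only, the probability-based flow recited above (predicting expected behavioral activity from historic activity, determining a probability of occurrence for new activity, weighting successive diversions, accounting for absent high-probability activity, and alerting on a risk threshold) may be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the function names, the per-window frequency estimate, the divergence cutoff of 0.5, the absence cutoff of 0.9, the weight step of 0.5, and the example threshold are all hypothetical choices made for the example.

```python
from collections import Counter


def build_expected(history):
    """Predict expected activity from historic observation windows.

    history: list of sets of activity names, one set per window.
    Returns a dict mapping activity -> fraction of windows containing it.
    (A simple frequency estimate; an assumption made for this sketch.)
    """
    if not history:
        return {}
    counts = Counter(activity for window in history for activity in window)
    return {activity: n / len(history) for activity, n in counts.items()}


def probability_of_occurrence(expected, activity):
    """Probability that an activity occurs; unseen activities get 0.0."""
    return expected.get(activity, 0.0)


def risk_level(expected, new_window, diverge_cutoff=0.5,
               absent_cutoff=0.9, weight_step=0.5):
    """Generate a risk level for one new observation window.

    Lower probability of occurrence contributes more risk, each
    successive diversion receives a larger (hypothetical) weight, and a
    normally present activity that is absent also adds risk.
    """
    risk = 0.0
    diversions = 0
    for activity in sorted(new_window):
        p = probability_of_occurrence(expected, activity)
        if p < diverge_cutoff:  # new activity diverges from expectation
            diversions += 1
            weight = 1.0 + weight_step * (diversions - 1)
            risk += weight * (1.0 - p)
    for activity, p in expected.items():
        if p >= absent_cutoff and activity not in new_window:
            risk += p  # highly expected activity is absent
    return risk


def should_alert(risk, threshold):
    """Issue an alert only when the risk level exceeds the threshold."""
    return risk > threshold


if __name__ == "__main__":
    history = [{"login", "backup"}] * 3 + [{"login", "backup", "update"}]
    expected = build_expected(history)
    window = {"login", "powershell_download", "registry_edit"}
    risk = risk_level(expected, window)
    print(risk, should_alert(risk, threshold=2.0))
```

In this sketch a window that matches the historic pattern yields a risk level of zero, while unfamiliar activities and a missing routine activity each raise the risk level. Other probability models and weighting schemes could be substituted without changing the overall flow.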